Corpus: ido_wikipedia_2007_30K


4.4.1.5 Number of Word-N-grams at Sentence Endings

Number of distinct word N-grams (N = 1 to 5) at sentence endings, counted over the first K sentences

K        # of words  # of bigrams  # of trigrams  # of 4-grams  # of 5-grams
100              74            78             85            86            86
1000            569           631            753           760           765
10000          4834          5792           7020          7136          7335
100000        12201         15897          19200         19858         20744
1000000       12201         15897          19200         19858         20744
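
Counts of this kind could be reproduced roughly as in the following Python sketch. It is an illustration under stated assumptions, not the tool that generated the table: the function name `ending_ngram_counts` is hypothetical, tokens are obtained by whitespace splitting, sentences shorter than N contribute no N-gram, and a K larger than the corpus simply reports the totals for the whole corpus (which would explain why the last two rows above are identical).

```python
from typing import Dict, Iterable, List


def ending_ngram_counts(sentences: Iterable[str],
                        ks: Iterable[int],
                        max_n: int = 5) -> Dict[int, List[int]]:
    """Count distinct sentence-final word N-grams (N = 1..max_n).

    Returns a dict mapping each K to a list of max_n counts, taken
    after the first K sentences have been read.
    """
    # One set of distinct ending N-grams per N (index 0 holds N = 1).
    seen = [set() for _ in range(max_n)]
    pending = sorted(set(ks))
    results: Dict[int, List[int]] = {}
    read = 0

    for sentence in sentences:
        # Assumption: whitespace tokenization, punctuation left as-is.
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            if len(tokens) >= n:
                seen[n - 1].add(tuple(tokens[-n:]))
        read += 1
        # Record a snapshot whenever a requested K is reached.
        while pending and pending[0] == read:
            results[pending.pop(0)] = [len(s) for s in seen]

    # Any K beyond the corpus size gets the totals for the full corpus.
    for k in pending:
        results[k] = [len(s) for s in seen]
    return results


if __name__ == "__main__":
    demo = ["the cat sat", "a cat sat", "the dog ran", "sat"]
    for k, counts in ending_ngram_counts(demo, [2, 4, 10]).items():
        print(k, counts)
```

Keeping one set per N means the corpus is read only once, and each snapshot costs just a handful of `len` calls.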


Zipf's diagram for sentence endings


[Figure: Gnuplot plot of Zipf's diagram for sentence endings]
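
The data behind a Zipf diagram is a rank-frequency list: items sorted by descending frequency, usually plotted on log-log axes. The Python sketch below shows one way such points might be computed for sentence endings. The function name `zipf_points`, the whitespace tokenization, and the choice of N = 1 as the default are assumptions for illustration; the source does not say which N-gram length the original plot uses.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def zipf_points(sentences: Iterable[str], n: int = 1) -> List[Tuple[int, int]]:
    """Return (rank, frequency) pairs for sentence-final word N-grams."""
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.split()  # assumption: whitespace tokenization
        if len(tokens) >= n:
            counts[tuple(tokens[-n:])] += 1
    # Rank 1 is the most frequent ending; ties keep an arbitrary order.
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))


if __name__ == "__main__":
    demo = ["the cat sat", "a cat sat", "the dog ran", "sat"]
    # Two whitespace-separated columns, a layout Gnuplot can read directly.
    for rank, freq in zipf_points(demo):
        print(rank, freq)
```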

3138 msec needed at 2017-12-24 17:39